A Linguistically-Based Segmentation of Complex Sentences
Abstract
The paper describes a method of dividing complex sentences into segments: easily detectable, linguistically motivated units that may provide a basis for further processing of complex sentences. The method has been developed for Czech, as a representative of languages with a relatively high degree of word-order freedom. The paper introduces the key terms and describes the segmentation chart, a data structure that captures the mutual relationships between individual segments and separators. It also presents a simple set of rules and applies them to the segmentation of a small set of Czech sentences. Issues of segment annotation based on an existing corpus are also discussed.
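The idea of cutting a sentence into segments at separators can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the separator inventory (`PUNCTUATION`, `CONJUNCTIONS`), the function names `tokenize` and `segment`, and the flat output list are hypothetical and are not the paper's rule set or its segmentation chart, which the abstract does not spell out.

```python
import re

# Hypothetical separator inventory (an assumption, not the paper's rules):
# punctuation marks plus a few common Czech conjunctions.
PUNCTUATION = {",", ";", ":", "(", ")", "-", ".", "!", "?"}
CONJUNCTIONS = {"a", "ale", "nebo", "protože", "když", "že"}


def tokenize(sentence):
    # A token is either a run of word characters or a single punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)


def segment(sentence):
    """Split a sentence into a linear list of ("segment", tokens) and
    ("separator", tokens) items. Adjacent separator tokens are merged."""
    items = []
    current = []
    for tok in tokenize(sentence):
        if tok in PUNCTUATION or tok.lower() in CONJUNCTIONS:
            if current:
                # Close the segment collected so far.
                items.append(("segment", current))
                current = []
            if items and items[-1][0] == "separator":
                # Merge with the immediately preceding separator, e.g. ", ale".
                items[-1][1].append(tok)
            else:
                items.append(("separator", [tok]))
        else:
            current.append(tok)
    if current:
        items.append(("segment", current))
    return items


result = segment("Petr spal, ale Jana pracovala.")
for kind, tokens in result:
    print(kind, tokens)
```

For the example sentence, the comma and the conjunction "ale" are merged into one separator that sits between two segments. A real system would need a linguistically grounded separator list and a richer structure than this flat list to record how segments relate to one another.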
Similar resources
Segmentation of Complex Sentences
The paper describes a method of dividing complex sentences into segments, easily detectable and linguistically motivated units that may be subsequently combined into clauses and thus provide a structure of a complex sentence with regard to the mutual relationship of individual clauses. The method has been developed for Czech as a language representing languages with a relatively high degree of wo...
Annotation of sentence structure - Capturing the relationship between clauses in Czech sentences
The focus of this article is on the creation of a collection of sentences manually annotated with respect to their sentence structure. We show that the concept of linear segments—linguistically motivated units, which may be easily detected automatically—serves as a good basis for the identification of clauses in Czech. The segment annotation captures such relationships as subordination, coordin...
Automatic linguistic segmentation of conversational speech
As speech recognition moves toward more unconstrained domains such as conversational speech, we encounter a need to be able to segment (or resegment) waveforms and recognizer output into linguistically meaningful units, such as sentences. Toward this end, we present a simple automatic segmenter of transcripts based on N-gram language modeling. We also study the relevance of several word-level fe...
Identifying linguistic segmentations in Chinese spoken dialogue
In a continuous speech recognition system, a longer waveform is usually segmented into shorter pieces based on simple acoustic criteria, such as unfilled pauses (i.e., silences). We call this kind of segmentation an acoustic segmentation. In general, acoustic segmentations do not reflect the linguistic structure. They may fragment sentences or semantic units. Besides, they may als...
The Role of Self-Regulatory Approach in Iranian Learners' Lexical Segmentation: The case of authentic materials
The present research investigated the effect of a self-regulatory approach (with the two components of self-checking and self-efficacy) on pre-intermediate Iranian learners' lexical segmentation in listening comprehension via authentic listening comprehension texts. To achieve this purpose, the investigators administered an Oxford Placement Test (2007) to ninety-eight students of two girls’ private j...